GSCALER: Synthetically Scaling A Given Graph
Authors
Abstract
Enterprises and researchers often have datasets that can be represented as graphs (e.g., social networks). The owner of a large graph may want to scale it down to a smaller version, e.g., for application development. Conversely, the owner of a small graph may want to scale it up to a larger version, e.g., to test system scalability. This paper investigates the Graph Scaling Problem (GSP): given a directed graph G and positive integers ñ and m̃, generate a similar directed graph G̃ with ñ nodes and m̃ edges. The paper presents Gscaler, a graph scaling algorithm for GSP. Analogous to DNA shotgun sequencing, Gscaler decomposes G into small pieces, scales them, and then uses the scaled pieces to construct G̃. This construction is based on the indegree/outdegree correlation of nodes and edges. Extensive tests with real graphs show that Gscaler is scalable and that, for many graph properties, it generates a G̃ with greater similarity to G than other state-of-the-art solutions such as Stochastic Kronecker Graph and UpSizeR.
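To make the problem statement concrete, the following is a minimal Python sketch of one naive way to approach GSP. It is not the Gscaler algorithm: it substitutes a simple configuration-model-style baseline that resamples each node's (indegree, outdegree) pair from G, adjusts the totals to m̃, and then wires edge stubs at random. The function names `scale_degree_pairs` and `wire_edges` are hypothetical, and the result is a multigraph that may contain self-loops and repeated edges.

```python
import random
from collections import Counter


def scale_degree_pairs(edges, n_new, m_new, seed=0):
    """Return n_new (indegree, outdegree) pairs whose totals both equal m_new.

    Assumption: pairs are resampled with replacement from the input graph,
    which keeps the joint in/out-degree distribution only approximately.
    """
    rng = random.Random(seed)
    indeg, outdeg, nodes = Counter(), Counter(), set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))
    pairs = [(indeg[x], outdeg[x]) for x in sorted(nodes)]

    sampled = [list(rng.choice(pairs)) for _ in range(n_new)]

    # Nudge random entries up or down until each column sums to m_new.
    for k in (0, 1):  # 0 = indegree column, 1 = outdegree column
        total = sum(p[k] for p in sampled)
        while total != m_new:
            i = rng.randrange(n_new)
            if total < m_new:
                sampled[i][k] += 1
                total += 1
            elif sampled[i][k] > 0:
                sampled[i][k] -= 1
                total -= 1
    return [tuple(p) for p in sampled]


def wire_edges(pairs, seed=0):
    """Match out-stubs to shuffled in-stubs (self-loops and duplicates allowed)."""
    rng = random.Random(seed)
    out_stubs = [i for i, (_, d_out) in enumerate(pairs) for _ in range(d_out)]
    in_stubs = [i for i, (d_in, _) in enumerate(pairs) for _ in range(d_in)]
    rng.shuffle(in_stubs)
    return list(zip(out_stubs, in_stubs))


if __name__ == "__main__":
    g = [(0, 1), (1, 2), (2, 0), (0, 2)]  # toy input graph G
    scaled_pairs = scale_degree_pairs(g, n_new=10, m_new=25, seed=1)
    g_tilde = wire_edges(scaled_pairs, seed=1)
    print(len(scaled_pairs), "nodes,", len(g_tilde), "edges")
```

A baseline like this matches only the node and edge counts and, approximately, the degree distribution; the abstract's point is that Gscaler additionally uses the indegree/outdegree correlation of nodes and edges when assembling G̃.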
Similar resources
Diameter Two Graphs of Minimum Order with Given Degree Set
The degree set of a graph is the set of its degrees. Kapoor et al. [Degree sets for graphs, Fund. Math. 95 (1977) 189-194] proved that for every set of positive integers, there exists a graph of diameter at most two and radius one with that degree set. Furthermore, the minimum order of such a graph is determined. A graph is 2-self-centered if its radius and diameter are two. In this paper for ...
Dscaler: Synthetically Scaling A Given Relational Database
The Dataset Scaling Problem (DSP) defined in previous work states: Given an empirical set of relational tables D and a scale factor s, generate a database state D̃ that is similar to D but s times its size. A DSP solution is useful for application development (s < 1), scalability testing (s > 1) and anonymization (s = 1). Current solutions assume all table sizes scale by the same ratio s. Howeve...
Graph Hybrid Summarization
One approach to processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...
A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs
We describe parallel algorithms for computing maximal cardinality matching in a bipartite graph on distributed-memory systems. Unlike traditional algorithms that match one vertex at a time, our algorithms process many unmatched vertices simultaneously using a matrix-algebraic formulation of maximal matching. This generic matrix-algebraic framework is used to develop three efficient maximal match...
Estimating Perimeter Using Graph Cuts
We investigate the estimation of the perimeter of a set by a graph cut of a random geometric graph. For Ω ⊂ D = (0, 1)^d, with d ≥ 2, we are given n random i.i.d. points on D whose membership in Ω is known. We consider the sample as a random geometric graph with connection distance ε > 0. We estimate the perimeter of Ω (relative to D) by the appropriately rescaled graph cut between the vertices...
Journal title:
Volume, issue:
Pages: -
Publication date: 2016